Mechanistic Interpretability - Day 1

Hey, I've started working through Neel Nanda's exercise problems on mechanistic interpretability, a good pathway for getting well versed in a topic closely tied to AI safety.

Prerequisite course - Day 1

I completed the prerequisite course on day 1, and I'm fully committed to this. So I'm just putting my learner's log here.

Einops

I learnt a lot of einops and really liked playing around with it. It is so intuitive, and it is a much better alternative to indexing dimensions by position like normal and figuring out which magic numbers I should put in there.

To use it, all you have to do is call einops.rearrange(tensor, "a b -> b a"), where a and b name the dimensions. The same pattern style works with repeat, reduce and others (there is a bunch of other stuff too, but most commonly anyone will use the 3 Rs).

Broadcasting

Broadcasting is a tensor's ability to stretch a dimension of size 1, whether it is at the front or anywhere else, so that it matches the size the other tensor has in that dimension. It only works when, at each dimension position (compared from the last dimension backwards), the two sizes are either equal or one of them is 1.

For example, let's say there is a tensor1 with shape (1, 2, 4) and another tensor t2 with shape (3, 2, 4). When you add tensor1 to t2, tensor1 is automatically expanded to (3, 2, 4).

But if tensor1 has shape (1, 3, 4) and t2 has shape (3, 2, 4), this won't work: the second dimension has different sizes (3 vs 2) and neither of them is 1, so the shapes mismatch.
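The two cases above can be checked quickly. This is a sketch using NumPy arrays as a stand-in for tensors (an assumption on my side; PyTorch follows the same broadcasting rule but raises RuntimeError instead of ValueError). The names out, bad and mismatch are just for illustration.

```python
import numpy as np

tensor1 = np.ones((1, 2, 4))
t2 = np.zeros((3, 2, 4))

# The size-1 first dimension of tensor1 is stretched to 3
out = tensor1 + t2
print(out.shape)  # (3, 2, 4)

# Second dimension is 3 vs 2 and neither is 1, so this cannot broadcast
bad = np.ones((1, 3, 4))
try:
    bad + t2
    mismatch = False
except ValueError as err:
    mismatch = True
    print("broadcast failed:", err)
```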

Sample distribution

There were lots of exercises but I really liked this one, because it made me think differently about cumulative probability.

The task is to draw samples of indices from a probability distribution. This distribution gives the probability of each index occurring in the real dataset. I'm using these probabilities to determine how many times an index may appear in my sample.

So to take samples, I generated n random numbers and did a comparison operation: for each random number x, I check x > (cumulative probability of index i) for every index i. The number of times that comparison is true across the list of cumulative probabilities gives the sampled index. It's like a reinterpretation of the original value x, which lies between 0 and 1, as an index value.

If you do this by hand, it will make sense.
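Here is roughly how that comparison trick can be sketched. This is my own minimal version with NumPy, not the official exercise solution; the function name indices_from_uniform, the example probabilities and the seed are all made up for illustration.

```python
import numpy as np

def indices_from_uniform(x, probs):
    """Map uniform numbers x in [0, 1) to indices distributed like probs."""
    cum = np.cumsum(probs)  # cumulative probability per index
    # Compare every x with every cumulative value: boolean array (n, k).
    # Counting the True entries per row gives the sampled index.
    idx = (x[:, None] > cum[None, :]).sum(axis=1)
    # Guard against float rounding making the last cumulative value < 1
    return np.minimum(idx, len(probs) - 1)

probs = np.array([0.1, 0.2, 0.7])

# By hand: cumulative values are about [0.1, 0.3, 1.0]
# 0.05 exceeds none -> 0, 0.25 exceeds one -> 1, 0.9 exceeds two -> 2
by_hand = indices_from_uniform(np.array([0.05, 0.25, 0.9]), probs)
print(by_hand)

# With many random draws the index frequencies approach probs
rng = np.random.default_rng(0)
samples = indices_from_uniform(rng.random(10_000), probs)
freqs = np.bincount(samples, minlength=len(probs)) / len(samples)
print(freqs)
```

Index i gets picked exactly when x falls in the slice of (0, 1) between cumulative value i-1 and cumulative value i, and that slice has width probs[i].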

einsum

It's creative.

It can do matrix multiplication, compute the sum of a diagonal (the trace), multiply a matrix with a vector, and a lot of other cool mathy stuff.
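A small sketch of those three operations, assuming np.einsum (the exercises may use a different einsum function, but the index-string idea is the same). The arrays A, B, M and v are just toy inputs.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
B = np.arange(12).reshape(3, 4)
M = np.arange(9).reshape(3, 3)   # diagonal is 0, 4, 8
v = np.arange(3)                 # [0, 1, 2]

# Matrix multiply: the shared index j is summed over
matmul = np.einsum("ij,jk->ik", A, B)

# Diagonal sum (trace): repeating i picks the diagonal, "->" sums it
trace = np.einsum("ii->", M)          # 0 + 4 + 8 = 12

# Matrix times vector: sum over j, keep i
matvec = np.einsum("ij,j->i", A, v)   # [5, 14]

print(matmul.shape, trace, matvec)
```

Any index that is missing on the right side of the arrow gets summed over.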

Go explore the exercises, and ping me if you have any doubts :).